Filter runtime guard logs by ready revision #277
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 82f8fb85f9
if revision_start and revision_start > fallback:
    return revision_start
Use serving revisions when narrowing the log window
When a service's latest ready revision is not receiving 100% of traffic, as with a no-traffic deploy or a traffic split, this cutoff is applied to every log for the service, because the query still filters only by `service_name`. As soon as a newer ready revision exists, failures logged earlier in the lookback window by the older revision that is still serving can be hidden. Base the cutoff and filter on the serving revisions in `status.traffic`, or include the resolved `revision_name` in the filter if the guard is intentionally checking only that revision.
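The suggestion above could be sketched roughly as follows. This is a minimal illustration, not the PR's code: `serving_revision_names` and `revision_log_filter` are hypothetical helpers, and the payload shape (a `status.traffic` list whose entries carry `revisionName` and `percent`) is assumed to match the service JSON the guard already fetches.

```python
from __future__ import annotations

from typing import Any


def serving_revision_names(service_payload: dict[str, Any]) -> list[str]:
    """Return revisions that currently receive traffic (hypothetical helper)."""
    status = service_payload.get("status") or {}
    if not isinstance(status, dict):
        return []
    names: list[str] = []
    for entry in status.get("traffic") or []:
        if not isinstance(entry, dict):
            continue
        name = entry.get("revisionName")
        percent = entry.get("percent") or 0  # tag-only entries may omit percent
        if name and isinstance(percent, (int, float)) and percent > 0 and name not in names:
            names.append(name)
    return names


def revision_log_filter(service_name: str, revision_names: list[str]) -> str:
    """Build a Cloud Logging filter scoped to the given revisions.

    With no revisions, the filter falls back to the whole service.
    """
    clauses = [
        'resource.type="cloud_run_revision"',
        f'resource.labels.service_name="{service_name}"',
    ]
    if revision_names:
        any_revision = " OR ".join(
            f'resource.labels.revision_name="{name}"' for name in revision_names
        )
        clauses.append(f"({any_revision})")
    return " AND ".join(clauses)
```

With a filter like this, a newer no-traffic revision would not narrow the window for logs from the revision that is still serving.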
metadata = revision_payload.get("metadata") or {}
if not isinstance(metadata, dict):
    return None
return _parse_timestamp(metadata.get("creationTimestamp"))
Cut off at the ready transition time
When the new revision takes time to become Ready, `metadata.creationTimestamp` precedes the Ready/ContainerHealthy transition, so startup or readiness-check errors emitted after creation still pass the timestamp filter. With `RUNTIME_GUARD_IGNORE_PRE_READY_REVISION_LOGS` enabled, the guard can therefore still alert on transient pre-ready logs from the revision that eventually became ready. Use the Ready condition's transition time as the lower bound, falling back to the creation timestamp only if that time is unavailable.
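One way to apply this, sketched under assumptions: the revision payload is assumed to expose `status.conditions` entries with `type`, `status`, and `lastTransitionTime`, and `parse_timestamp` is a stand-in for the PR's `_parse_timestamp`, whose actual behavior is not shown here. `revision_ready_time` is a hypothetical name.

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Any, Optional


def parse_timestamp(value: Any) -> Optional[datetime]:
    """Stand-in for the PR's _parse_timestamp: parse an RFC 3339 string."""
    if not isinstance(value, str) or not value:
        return None
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"  # older fromisoformat() rejects "Z"
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def revision_ready_time(revision_payload: dict[str, Any]) -> Optional[datetime]:
    """Prefer the Ready transition time; fall back to creationTimestamp."""
    status = revision_payload.get("status") or {}
    conditions = status.get("conditions") if isinstance(status, dict) else None
    for condition in conditions or []:
        if not isinstance(condition, dict):
            continue
        if condition.get("type") == "Ready" and condition.get("status") == "True":
            ready_at = parse_timestamp(condition.get("lastTransitionTime"))
            if ready_at is not None:
                return ready_at
    metadata = revision_payload.get("metadata") or {}
    if not isinstance(metadata, dict):
        return None
    return parse_timestamp(metadata.get("creationTimestamp"))
```

The fallback keeps the current behavior for payloads that lack a usable Ready condition, so the change only tightens the window when the transition time is known.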
)

for service in services:
    service_since = _cloud_run_log_since(project, service, since) if ignore_pre_ready_logs else since
Keep the success check on the configured lookback
When `RUNTIME_GUARD_REQUIRE_SUCCESS=true` and a new revision became ready inside the lookback window but has not received traffic yet, the same shortened `service_since` window is used for counting successes. That excludes successful requests from earlier in the configured lookback and emits a "no successful Cloud Run request ... in the last N minutes" alert even though the requirement was satisfied. Keep the success query on `since`, or change the alert semantics and message to require a post-ready success.
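A small sketch of the first option, keeping the two lower bounds separate. `GuardWindows` and `guard_windows` are hypothetical names introduced for illustration; `revision_start` stands for whatever cutoff the guard resolves for the new revision.

```python
from __future__ import annotations

from datetime import datetime
from typing import NamedTuple, Optional


class GuardWindows(NamedTuple):
    error_since: datetime    # lower bound for the error-log query
    success_since: datetime  # lower bound for the success-count query


def guard_windows(
    since: datetime,
    revision_start: Optional[datetime],
    ignore_pre_ready_logs: bool,
) -> GuardWindows:
    """Narrow only the error window; successes keep the configured lookback."""
    error_since = since
    if ignore_pre_ready_logs and revision_start and revision_start > since:
        error_since = revision_start
    return GuardWindows(error_since=error_since, success_since=since)
```

Under this split, the alert text about the last N minutes stays accurate, because the success count still covers the full configured window.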
Summary
Validation